According to Leon's instructions, we need a 10-slide presentation, so we could use the following structure:
Question: Which genes are differentially expressed in different subtypes of cancer?
General workflow
Considerations about the data:
- Excluded men as they represented a very small portion of the study
- Removed uninformative genes (zero variance across samples)
- Removed NAs and merged metadata with counts data
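The considerations above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the pipeline actually used: the toy data, the names `counts`, `samples`, `metadata` and `preprocess`, and the `sex` / `tumor` fields are all hypothetical. It also assumes that "removed NAs" means dropping genes with any missing value, which the notes do not specify.

```python
from statistics import pvariance

# Hypothetical toy data: counts[gene] holds one value per sample, in `samples` order.
samples = ["s1", "s2", "s3", "s4"]
counts = {
    "GENE_A": [10, 12, 9, 30],
    "GENE_B": [5, 5, 5, 5],       # zero variance across samples
    "GENE_C": [0, 3, None, 8],    # contains a missing value
}
metadata = {
    "s1": {"sex": "female", "tumor": True},
    "s2": {"sex": "female", "tumor": False},
    "s3": {"sex": "female", "tumor": True},
    "s4": {"sex": "male", "tumor": True},
}


def preprocess(counts, samples, metadata):
    # 1. Keep only female samples (men were excluded from the study).
    keep = [i for i, s in enumerate(samples) if metadata[s]["sex"] == "female"]
    kept_samples = [samples[i] for i in keep]

    filtered = {}
    for gene, values in counts.items():
        row = [values[i] for i in keep]
        # 2. Assumption: "removed NAs" = drop genes with any missing value.
        if any(v is None for v in row):
            continue
        # 3. Drop genes whose variance across the kept samples is 0.
        if pvariance(row) == 0:
            continue
        filtered[gene] = row

    # 4. Merge metadata with counts: one record per sample.
    merged = []
    for j, s in enumerate(kept_samples):
        record = {"sample": s, **metadata[s]}
        record.update({gene: row[j] for gene, row in filtered.items()})
        merged.append(record)
    return merged


merged = preprocess(counts, samples, metadata)
```

In this toy example only `GENE_A` survives the filters, and each remaining record carries the sample's metadata next to its counts.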
Male-female ratio
Many of the differentially expressed genes fall on the left side of the volcano plot, i.e. they have negative log fold changes (lower expression relative to the reference group).
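To put a number on how many genes sit on each side of the volcano plot, the results table can be tallied by direction. The sketch below is illustrative only: the gene names and values are made up, and the cutoffs (absolute log2 fold change of at least 1, adjusted p-value below 0.05) are common defaults assumed here, not thresholds stated in these notes.

```python
from collections import Counter


def classify(log2fc, padj, lfc_cutoff=1.0, alpha=0.05):
    """Label a gene as 'down' (left side), 'up' (right side) or 'ns'."""
    if padj < alpha and log2fc <= -lfc_cutoff:
        return "down"
    if padj < alpha and log2fc >= lfc_cutoff:
        return "up"
    return "ns"  # not significant or effect too small


# Hypothetical results: gene -> (log2 fold change, adjusted p-value).
results = {
    "GENE_A": (-2.3, 0.001),
    "GENE_B": (-1.4, 0.01),
    "GENE_C": (1.8, 0.02),
    "GENE_D": (-0.2, 0.6),
    "GENE_E": (-3.0, 0.2),
}

tally = Counter(classify(lfc, p) for lfc, p in results.values())
```

A larger `tally["down"]` than `tally["up"]` corresponds to the left-heavy pattern described above.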
Here are the PCA results: a scree plot and the cumulative variance explained.
The large number of principal components needed to explain 85% of the variability in the data shows how complex cancer expression data are.
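The number of components behind that 85% figure can be read directly off the cumulative explained variance. The following is a small sketch with made-up variance ratios; in practice they would come from the fitted PCA (for example scikit-learn's `PCA.explained_variance_ratio_`), and the function name `components_for_threshold` is hypothetical.

```python
from itertools import accumulate


def components_for_threshold(explained_variance_ratio, threshold=0.85):
    """Return (k, cumulative), where k is the smallest number of leading
    components whose cumulative explained variance reaches `threshold`,
    or None if the threshold is never reached."""
    cumulative = list(accumulate(explained_variance_ratio))
    for k, total in enumerate(cumulative, start=1):
        if total >= threshold:
            return k, cumulative
    return None, cumulative


# Hypothetical ratios, sorted from largest to smallest as in a scree plot.
ratios = [0.30, 0.20, 0.15, 0.10, 0.08, 0.07, 0.05, 0.05]
k, cumulative = components_for_threshold(ratios)
```

With these toy ratios, six components are needed to reach 85%. The values in `cumulative` are what the cumulative variance plot displays.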
Overlapping patient clusters
The clusters of patients with and without tumors overlap, meaning the PCA does not clearly separate the two groups.